RNN PyTorch Source Code

Applies a multi-layer Elman RNN with tanh or ReLU non-linearity to an input sequence.
For each element in the input sequence, each layer computes the following function:

$$h_t = \text{tanh}(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh}) $$

where $h_t$ is the hidden state at time $t$, $x_t$ is the input at time $t$, and $h_{(t-1)}$ is the hidden state of the same layer at time $t-1$, or the initial hidden state at time 0. If nonlinearity is 'relu', then ReLU is used instead of tanh.
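
To make the recurrence concrete, the following is a minimal sketch that unrolls the formula by hand for a single-layer, unidirectional RNN and compares the result with nn.RNN. The sizes, the seed, and the variable names (manual_output, matches) are chosen here only for illustration; the parameter names weight_ih_l0, weight_hh_l0, bias_ih_l0 and bias_hh_l0 are the attributes nn.RNN exposes for layer 0.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions, not taken from the text above)
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20
rnn = nn.RNN(input_size, hidden_size, num_layers=1)  # tanh non-linearity by default

x = torch.randn(seq_len, batch, input_size)
h0 = torch.zeros(1, batch, hidden_size)

# Layer-0 parameters: W_ih is (hidden_size, input_size), W_hh is (hidden_size, hidden_size)
W_ih, W_hh = rnn.weight_ih_l0, rnn.weight_hh_l0
b_ih, b_hh = rnn.bias_ih_l0, rnn.bias_hh_l0

with torch.no_grad():
    # Unroll h_t = tanh(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh) step by step
    h = h0[0]
    manual_steps = []
    for t in range(seq_len):
        h = torch.tanh(x[t] @ W_ih.T + b_ih + h @ W_hh.T + b_hh)
        manual_steps.append(h)
    manual_output = torch.stack(manual_steps)  # (seq_len, batch, hidden_size)

    # Reference result from the module itself
    output, hn = rnn(x, h0)

# Compare up to floating-point tolerance
matches = torch.allclose(manual_output, output, atol=1e-5)
print(matches)
```

In this single-layer case the last hand-computed step should also agree with hn[0], since the final hidden state is simply the last element of the output sequence.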

  • Inputs: $input$, $h_0$


    • input of shape (seq_len, batch, input_size), containing the features of the input sequence. The input can also be a packed variable-length sequence. See torch.nn.utils.rnn.pack_padded_sequence or torch.nn.utils.rnn.pack_sequence for details.


    • $h_0$ of shape (num_layers * num_directions, batch, hidden_size), containing the initial hidden state for each element in the batch. Defaults to zeros if not provided. If the RNN is bidirectional, num_directions is 2; otherwise it is 1.

  • Outputs: $output$, $h_n$


    • output of shape (seq_len, batch, num_directions * hidden_size), containing the output features ($h_t$) from the last layer of the RNN, for each $t$. If a torch.nn.utils.rnn.PackedSequence has been given as the input, the output will also be a packed sequence. For the unpacked case, the directions can be separated using output.view(seq_len, batch, num_directions, hidden_size), with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.


    • $h_n$ of shape (num_layers * num_directions, batch, hidden_size), containing the hidden state for $t$ = seq_len. As with output, the layers and directions can be separated using h_n.view(num_layers, num_directions, batch, hidden_size).

  • note

    All the weights and biases are initialized from $\mathcal{U}(-\sqrt{k}, \sqrt{k})$,
    where $k = \frac{1}{\text{hidden\_size}}$

  • batch_first
    If True, then the input and output tensors are provided
    as (batch, seq, feature) instead of (seq, batch, feature). This does not apply to the hidden state. Default: False
  • Examples
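    input_size=10, hidden_size=20, num_layers=2

    import torch
    import torch.nn as nn

    rnn = nn.RNN(10, 20, 2)         # input_size, hidden_size, num_layers
    input = torch.randn(5, 3, 10)   # (seq_len, batch, input_size)
    h0 = torch.randn(2, 3, 20)      # (num_layers * num_directions, batch, hidden_size)
    output, hn = rnn(input, h0)     # output: (5, 3, 20), hn: (2, 3, 20)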
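
The view calls described under Outputs can be tried on a bidirectional RNN. The sketch below is only an illustration under assumed sizes; the names out_dirs, hn_layers, fwd_ok and bwd_ok are introduced here and are not part of the PyTorch API. It splits output and h_n by direction and checks how the two relate: the forward direction finishes at the last time step, while the backward direction finishes at the first.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions)
seq_len, batch, input_size, hidden_size, num_layers = 5, 3, 10, 20, 2
num_directions = 2  # bidirectional
rnn = nn.RNN(input_size, hidden_size, num_layers, bidirectional=True)

x = torch.randn(seq_len, batch, input_size)
with torch.no_grad():
    output, hn = rnn(x)  # h_0 is omitted, so it defaults to zeros

# output: (seq_len, batch, num_directions * hidden_size)
out_dirs = output.view(seq_len, batch, num_directions, hidden_size)
forward_out = out_dirs[:, :, 0]   # direction 0: forward
backward_out = out_dirs[:, :, 1]  # direction 1: backward

# h_n: (num_layers * num_directions, batch, hidden_size)
hn_layers = hn.view(num_layers, num_directions, batch, hidden_size)
last_fwd = hn_layers[-1, 0]  # last layer, forward direction
last_bwd = hn_layers[-1, 1]  # last layer, backward direction

# Forward final state sits at t = seq_len - 1, backward final state at t = 0
fwd_ok = torch.allclose(forward_out[-1], last_fwd, atol=1e-6)
bwd_ok = torch.allclose(backward_out[0], last_bwd, atol=1e-6)
print(output.shape, hn.shape, fwd_ok, bwd_ok)
```

Note that output only carries the last layer, whereas h_n carries the final state of every layer, which is why hn_layers is indexed with -1 before comparing.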

Time series representation


  • (seq_len, batch, input_size)
  • (batch, seq_len, input_size) # batch first
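
The two layouts can be compared directly. The sketch below, under assumed sizes, builds one time-major RNN and one with batch_first=True, copies the weights across so that both compute the same function, and feeds each the same data in its own layout. The names rnn_tf, rnn_bf and same are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes (assumptions)
seq_len, batch, input_size, hidden_size = 5, 3, 10, 20

rnn_tf = nn.RNN(input_size, hidden_size)                    # expects (seq_len, batch, input_size)
rnn_bf = nn.RNN(input_size, hidden_size, batch_first=True)  # expects (batch, seq_len, input_size)
rnn_bf.load_state_dict(rnn_tf.state_dict())                 # use identical weights in both modules

x_tf = torch.randn(seq_len, batch, input_size)
x_bf = x_tf.transpose(0, 1).contiguous()  # same data, batch dimension first

with torch.no_grad():
    out_tf, hn_tf = rnn_tf(x_tf)  # out_tf: (seq_len, batch, hidden_size)
    out_bf, hn_bf = rnn_bf(x_bf)  # out_bf: (batch, seq_len, hidden_size)

# The outputs agree once the first two axes are swapped back
same = torch.allclose(out_tf, out_bf.transpose(0, 1), atol=1e-6)
print(out_tf.shape, out_bf.shape, hn_bf.shape, same)
```

The hidden state keeps the shape (num_layers * num_directions, batch, hidden_size) in both cases; batch_first only changes the layout of the input and output tensors.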